Separation of Deep UV Resonance Raman Spectra for Pure Protein Secondary Structures based on D-H Exchange Data
Abstract
This project elaborates a Bayesian approach for extracting the resonance Raman spectrum of the highly ordered β-sheet structure of amyloid fibrils. The proposed algorithm incorporates prior information about characteristic spectral bands using the signal dictionary approach, and prior information about the concentration matrix by searching over the space of template mixing matrices. With further improvement, the algorithm could be used to extract spectra of species present in small fractions in IR, Raman, NMR, and MS spectral mixtures.

Introduction. A protein chain can adopt three different conformations, or secondary structures: random coil, α-helix, and β-sheet. Each protein can be thought of as a combination of these three secondary structures. Moreover, a dataset of deep UV resonance Raman spectra of hundreds of different proteins can be fitted with three component spectra, which suggests that the three pure secondary structure spectra are the same for most proteins. Knowing these three component spectra would allow the percentage of each secondary structure in a protein to be found by least-squares fitting of its Raman spectrum with those three spectra. The pure secondary structure spectra cannot be observed experimentally, since no protein consists of 100% of one particular secondary structure. To tackle this difficulty, we attempted to extract the latent pure component spectra using the Bayesian curve resolution approach.

Experimental Part. Amyloid fibrils are a specific form of protein composed of a highly ordered β-sheet core surrounded by a random coil part. The contribution of the third component, α-helix, is negligible and can be disregarded. Figure 1 shows the structure of amyloid fibrils. The random coil parts are exposed to water, while the β-sheet core of the fibrils is known to be buried and inaccessible to the aqueous solution.
If water is replaced by heavy water (D2O), the H atoms of the random coil are substituted by D atoms, which down-shifts all Raman bands involving N-H vibrations. The N-H vibration bands of the buried β-sheet are not affected by deuterium exchange and therefore retain their positions. This allows the β-sheet and random coil structures to be separated by gradually changing the H2O/D2O ratio in solution. Fifty samples of fibrils were prepared, starting with fibrils in 100% H2O, then 98% H2O plus 2% D2O, 96% H2O plus 4% D2O, and so on up to 100% D2O, to acquire 50 Raman spectra. Spectra of H2O, D2O, and 50/50 (H2O + D2O) mixtures were recorded separately.

Figure 1. Structure of amyloid fibrils. The core β-sheet is shown with saw-shaped purple lines.

Theoretical Part. The chosen experimental procedure made it possible to predict the contribution of each species to the spectra of the fibrils.

(a) Anticipating H2O, D2O, and HOD contributions in each sample. H and D atoms from H2O and D2O molecules readily interchange to form mixed HOD molecules. If the total fraction of protons in the H2O-D2O mixture is q, then the probabilities of forming H2O, D2O, and HOD are as follows:

| Probability | Possible combinations | Statistical weight |
|---|---|---|
| P(H2O \| q, I) ~ q^2 | H-O-H | 1 |
| P(D2O \| q, I) ~ (1-q)^2 | D-O-D | 1 |
| P(HOD \| q, I) ~ 2·(1-q)·q | D-O-H, H-O-D | 2 |

Simulated concentration fractions of H2O, D2O, and HOD versus the fraction of added D2O are shown in Figure 2.

Figure 2. Anticipated concentrations of all components.

(b) Anticipating random coil(H) and random coil(D) contributions in each sample. The fractions of N-H bonds and N-D bonds in the random coil part are proportional to the total concentrations of H and D atoms, respectively, i.e. they follow linearly the fraction of added D2O (Figure 2).

(c) Anticipating the contribution of the other components. Each fibril sample was prepared from the same stock fibril material. This implies that the fraction of the β-sheet core with respect to the random coil part is constant across all samples, i.e.
Frac(β-sheet) / (Frac(coil(H)) + Frac(coil(D))) ~ const.

All spectra exhibit an admixture of signal from quartz (spectra were recorded in a quartz tube) and from atmospheric oxygen. Both fractions are random (Figure 2).

[Figure 2 plot: anticipated fractions of H2O, D2O, HOD, random part H, random part D, fibril core, and quartz and oxygen versus the fraction of D2O, 0 to 100%.]

Table 1 below summarizes the available prior information about all the components.

Table 1. Prior information on the spectra and concentrations of components.

| Pure component | Spectrum | Concentration profile |
|---|---|---|
| H2O | Known | Shape is known |
| HOD | Known | Shape is known |
| D2O | Known | Shape is known |
| Unordered part, H substituted | Unknown | Shape is known |
| Unordered part, D substituted | Unknown | Shape is known |
| Fibril core | Unknown | Shape is known |
| Quartz | Known | Small random contribution |
| Oxygen (molecular) | Known | Small random contribution |

Bayesian Approach. The source separation problem is as follows:

$$\mathrm{Data} = C \cdot S + E \tag{1}$$

where C is the concentration matrix, S is the matrix of pure component spectra, and E is the error, whether random or systematic. The matrix Data is known, while the matrices C and S are to be estimated. Bayes' theorem for problem (1) is written as

$$P(C, S \mid \mathrm{Data}, I) \propto P(\mathrm{Data} \mid C, S, I)\, P(C \mid I)\, P(S \mid I) \tag{2}$$

where P(Data | C, S, I) is the likelihood controlling the quality of the fit, and P(S | I) and P(C | I) are the prior probabilities for the spectra and concentrations. Because finding either matrix C or S alone is enough to solve problem (1), the concentration matrix C is normally sought, since it contains far fewer elements. It was shown [1] that, for a uniform prior on the concentration matrix and independent sources, the probability of the concentration matrix is given by

$$P(C \mid \mathrm{Data}, I) \propto \int ds \prod_i \delta\Big(\mathrm{Data}_i - \sum_k C_{ik} s_k\Big) \prod_l p_l(s_l) \tag{3}$$

which in the case of noise-free data reduces, up to an additive constant, to the logarithmic probability

$$\log P(C \mid \mathrm{Data}, I) = \log(\det(W)) + \sum_l \log(p_l(s_l)) \tag{4}$$

where W is the separation matrix such that S = W · Data.
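As a rough illustration of the mixing model (1) and of the anticipated concentration profiles, the following Python sketch builds a template of concentration profiles and a synthetic data matrix. It is a minimal sketch under stated assumptions, not the authors' code: it assumes a random (binomial) distribution of H and D over water molecules, so that the H2O, HOD, and D2O fractions are q^2, 2q(1-q), and (1-q)^2; it takes the proton fraction q as one minus the added D2O fraction; it omits quartz and oxygen; and the function name `template_matrix`, the ratio `core_to_coil`, the 200-point wavenumber axis, and the random spectra `S` are all hypothetical.

```python
import numpy as np

def template_matrix(d2o_fraction, core_to_coil=1.0):
    """Anticipated concentration profiles: one row per sample, one column per component.

    d2o_fraction : added D2O fractions in [0, 1] (one value per sample).
    core_to_coil : assumed constant ratio core / (coil(H) + coil(D)); hypothetical value.
    Columns: H2O, HOD, D2O, coil(H), coil(D), fibril core.
    """
    p = np.asarray(d2o_fraction, dtype=float)
    q = 1.0 - p                      # assumed total fraction of protons
    h2o = q ** 2                     # H-O-H, statistical weight 1
    hod = 2.0 * q * (1.0 - q)        # H-O-D and D-O-H, statistical weight 2
    d2o = (1.0 - q) ** 2             # D-O-D, statistical weight 1
    coil_h = q                       # N-H fraction follows the proton fraction
    coil_d = 1.0 - q                 # N-D fraction follows the deuteron fraction
    core = core_to_coil * (coil_h + coil_d)   # constant across samples
    return np.column_stack([h2o, hod, d2o, coil_h, coil_d, core])

rng = np.random.default_rng(0)
d2o_fraction = np.linspace(0.0, 1.0, 50)      # illustrative grid of 50 samples
T = template_matrix(d2o_fraction)             # shape (50, 6)

# Synthetic, made-up pure spectra on a 200-point axis, only to exercise Data = C·S + E.
S = rng.uniform(0.0, 1.0, size=(6, 200))
E = 0.01 * rng.standard_normal((50, 200))     # small random error
Data = T @ S + E
```

In this sketch the three water columns sum to one in every sample, and the core column is constant, which mirrors the constraints described in parts (a) to (c).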
Incorporating prior information about the C matrix. The actual concentration matrix should have columns proportional to the columns of the matrix sketched in Figure 2. Eight hyper-parameters αj then need to be estimated so that Cij = αj·Tij with j = 1:8, where T is the template matrix shown in Figure 2. The parameters αj account for the spectral fraction of each component in the mixtures (they are proportional but not equal to the physical concentrations and therefore must be found). For example, α1/α6 gives the spectral fraction of the fibril core with respect to that of water and is proportional to the concentration of protein in the samples. The concentration of protein is assumed to be equal in all samples, since they were prepared from the same stock solution. As seen from Figure 2, the contributions of quartz and oxygen are random; to be rigorous, 50 additional parameters would need to be assigned for the fraction of quartz in each sample and another 50 for oxygen. These are the nuisance parameters of the model. At this stage, however, we assume constant but unknown fractions of quartz and oxygen in each sample. After all relevant parameters are found, matrix least-squares will refine the small quartz and oxygen contributions in each sample. The posterior for the concentration matrix then takes the form

$$\log P(C \mid \mathrm{Data}, I) = \log(\det(W)) + \sum_l \log(p_l(s_l)) - \frac{m\,n}{2} \log\{\lVert C - T\alpha \rVert\} \tag{5}$$

where α is a diagonal matrix with the parameters αj on its diagonal, || || stands for the Frobenius norm, m is the number of experimental spectra (m = 50), and n is the number of pure components (n = 8).

Incorporating prior information about the S matrix. In our model, 5 pure spectra are known from experiment. For these known components it is straightforward to assign the inner product of the known spectrum and the resolved spectrum as the prior probability. Indeed, for normalized spectra the inner product equals 1 if the spectra completely overlap and 0 if they have no overlapping regions.
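The role of the scale parameters α in the template term can be illustrated with a short Python sketch. It is a hedged example rather than the procedure actually used: it assumes the template term enters the log posterior as -(m·n/2)·log||C - Tα||, and it uses the fact that, for a fixed C, the Frobenius norm is minimised column by column at αj = (Tj·Cj)/(Tj·Tj). The function names `fit_alpha` and `template_log_penalty` and the random demo matrices are hypothetical.

```python
import numpy as np

def fit_alpha(C, T):
    """Column scales alpha_j that minimise ||C - T diag(alpha)||_F for fixed C and T."""
    # Each column is an independent 1-D least-squares problem.
    return np.sum(T * C, axis=0) / np.sum(T * T, axis=0)

def template_log_penalty(C, T, alpha):
    """Assumed template term of the log posterior: -(m*n/2) * log ||C - T diag(alpha)||_F."""
    m, n = C.shape
    # T * alpha scales column j of T by alpha[j], i.e. T @ diag(alpha).
    resid = np.linalg.norm(C - T * alpha, ord="fro")
    return -0.5 * m * n * np.log(resid)

# Made-up demo: 50 samples, 8 components, known scales plus small noise.
rng = np.random.default_rng(1)
T_demo = rng.uniform(0.1, 1.0, size=(50, 8))
alpha_true = rng.uniform(0.5, 2.0, size=8)
C_demo = T_demo * alpha_true + 1e-3 * rng.standard_normal((50, 8))

alpha_hat = fit_alpha(C_demo, T_demo)
penalty = template_log_penalty(C_demo, T_demo, alpha_hat)
```

The term grows as the candidate concentration matrix approaches a column-scaled copy of the template, which is how it steers the search toward the anticipated profiles.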
For the other three spectra, P(sij) was set proportional to the reciprocal of sij: P(sij) ~ 1/sij. The total posterior probability then transforms into (6). As seen from (6), the assumption P(sij) ~ 1/sij for the unknown spectra results in the addition of a term known as a sparsity constraint. It controls the area of the resolved spectra and eliminates extraneous bands appearing as admixtures from the other spectra.

In the course of optimization, the resolved spectra turned out to have characteristic bands whose shapes were close to what we expected for these components. The space between those bands contained admixtures from other components and/or noise. Alternatively, the characteristic bands in the sought spectra were obtained using the pure variable approach [2]. The latter analyzes second-derivative or fourth-derivative spectra, in which even strongly overlapping bands are seen as distinct sharp peaks. The purest variable is a wavenumber at which the contribution of an individual component to the Raman intensity is maximal while the contributions from the other components are minimal. For the Raman spectrum of each sample, the intensity at a particular purest variable is taken to be proportional to the concentration of the corresponding individual component in the sample. Consequently, the matrix Cint of the Raman intensities at all purest variables can be used as the concentration matrix C of the components. The shapes of the spectral bands in normal (not second-derivative) space are then found as

$$S^{T} = \mathrm{Data}^{T}\, C_{int}\, (C_{int}^{T} C_{int})^{-1} \tag{7}$$

which is the least-squares solution of Data = Cint · S. Knowledge of the characteristic bands allows the unknown spectrum to be modeled as a linear combination of non-overlapping bands, which are referred to as dictionary bands [3].
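The least-squares step of the pure variable approach can be sketched in Python as follows. This is an illustrative sketch only: it assumes the purest-variable indices are already known (the derivative-based selection is not shown), it solves Data ≈ Cint·S with a generic least-squares routine rather than forming the inverse explicitly, and the function name `spectra_from_pure_variables` and the synthetic test matrices are hypothetical.

```python
import numpy as np

def spectra_from_pure_variables(data, pure_idx):
    """Estimate band shapes from intensities at the purest variables.

    data     : (m samples, k wavenumbers) matrix of spectra.
    pure_idx : one wavenumber index per component, assumed to be given.
    Returns (C_int, S), where S is the least-squares solution of data ≈ C_int @ S.
    """
    C_int = data[:, pure_idx]                          # (m, n) surrogate concentrations
    S, *_ = np.linalg.lstsq(C_int, data, rcond=None)   # (n, k) resolved band shapes
    return C_int, S

# Made-up demo: 3 components, each with one channel where only it contributes.
rng = np.random.default_rng(2)
pure_idx = [5, 20, 33]
S_true = rng.uniform(0.1, 1.0, size=(3, 40))
S_true[:, pure_idx] = np.eye(3)                        # make those channels "pure"
C_true = rng.uniform(0.1, 1.0, size=(50, 3))
data_demo = C_true @ S_true                            # noise-free mixture

C_int, S_est = spectra_from_pure_variables(data_demo, pure_idx)
```

With truly pure channels and noise-free data, the intensities at the purest variables coincide with the concentrations, so the recovered spectra match the ones used to build the mixture; with real data the match is only approximate.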
Publication date: 2010